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An investigation of the intellectual structure of opinion mining 

research 


Yongjun Zhu. Meen Chul Kim, and Chaomei Chen 


Introduction. Opinion mining has been receiving increasing attention from a broad range of 
scientific communities since early 2000s. The present study aims to systematically investigate the 
intellectual structure of opinion mining research. 

Method. Using topic search, citation expansion, and patent search, we collected 5,596 bibliographic 
records of opinion mining research. Then, intellectual landscapes, emerging trends, and recent 
developments were identified. We also captured domain-level citation trends, subject category 
assignment, keyword co-occurrence, document co-citation network, and landmark articles. 
Analysis. Our study was guided by scientometric approaches implemented in CiteSpace, a visual 
analytic system based on networks of co-cited documents. We also employed a dual-map overlay 
technique to investigate epistemological characteristics of the domain. 

Results. We found that the investigation of algorithmic and linguistic aspects of opinion mining has 
been of the community’s greatest interest to understand, quantify, and apply the sentiment 
orientation of texts. Recent thematic trends reveal that practical applications of opinion mining such 
as the prediction of market value and investigation of social aspects of product feedback have 
received increasing attention from the community. 

Conclusion. Opinion mining is fast-growing and still developing, exploring the refinements of 
related techniques and applications in a variety of domains. We plan to apply the proposed analytics 
to more diverse domains and comprehensive publication materials to gain more generalized 
understanding of the true structure of a science. 


Introduction 

Opinion mining refers to the task of finding opinions of people about specific 
entities t Feldman. 2012 L It employs computational linguistics and natural 
language processing to identify and extract subjective information from source 
texts. Sentiment analysis, which parallels opinion mining, aims to quantify and 
determine the polarity of an individual with regards to judgement, affective state, 
or emotion f Pang and Lee. 2008 k Sentiment analysis is used to automatically 
analyse evaluative texts and materials. Opinion mining and sentiment analysis are 
often seen as interchangeable concepts and express a mutual meaning f Medhat. 
Hassan. and Korashv. 2014 k Pang and Lee (2004) described the year of 2001 as 
the beginning of widespread awareness of opinion mining and sentiment analysis. 
Most of the early studies placed emphasis on proposing computational techniques 
to measure the sentiment orientation of product reviews in a binary way f 'l'het. 

Na. and Khoo. 2010 L Recent studies have been working on in-depth examination 
of source texts with a variety of techniques and units of analysis such as 
document, sentence, feature, or aspect. Within this context, these has been an 
exponentially increasing number of articles written on this topic (See Figure 2). 
Hundreds of startup companies are also developing sentiment analysis solutions 
and software packages t Feldman. 20TT L 

As the volume of the literature has exponentially grown, a wide range of 
communities has also paid large attention to opinion mining. Thus, a systematic 
analysis of the intellectual structure of the field is needed to understand its main 
applications and current challenges. The systematic domain analysis will allow the 


















followings. First, it helps the opinion mining community to be more self- 
explanatory as it has a detailed bibliometric profile. Secondly, researchers in the 
field can benefit from the systematic domain analysis by better positioning their 
research, identifying research trends, and expanding research territories. Finally, 
it guides researchers who are interested in the field to learn about emerging 
trends and current challenges. To our knowledge, however, there has been little 
concerted effort which aims to systematically understand the intellectual 
landscapes in opinion mining. The motivation of the present study lies in our 
intention to identify the intellectual structure of opinion mining and sentiment 
analysis research in a quantitative and systematic manner. We explored thematic 
patterns, landmark articles, emerging trends, and new developments of the field. 
In particular, our study was guided by a computational approach implemented in 
CiteSpace, a visual analytic system for illustrating emerging trends and critical 
changes in scientific literature f Chen. 2006 : Chen et ah. 2012 k We also employed 
a dual-map overlay technique f Chen and Levdesdorff. 2014 ) to expand the present 
study to the investigation of citation patterns at a disciplinary level. 

In what follows, three research questions frame our investigation: 

• Research question #1 : what are epistemological characteristics of opinion 
mining research? 

• Research question #2: which thematic patterns of research do occur in the 
domain? 

• Research question #3: what are emerging trends and recent developments 
in the field? 

The rest of the study is organized as follows. First, we survey preceding work. 
Then, the scientometric methodology of the study is introduced. We describe 
intellectual structure of the field and key findings. This is followed by the 
conclusion and proposed future study. 

Related work 

To date, there has been little research that has systematically investigated 
comprehensive intellectual landscapes and emerging research trends in opinion 
mining. Therefore, we examined review articles on opinion mining since review 
papers aim to survey and synthesize key findings and research trends in a specific 
domain of interest. Pang and Lee (2008) presented a detailed review on broad 
aspects of opinion mining and sentiment analysis. This survey featured 
applications of opinion mining and sentiment analysis such as classification and 
extraction of textual units and summarization of sentiment information from a set 
of single and multi-documents. Based on the authors’ expertise on the field, a 
large number of articles were surveyed. A list of classifiers and their performance 
in grouping opinions were discussed by Govindarajan and Romina (2013). They 
summarized the performance of naive Bayes, support vector machine, and genetic 
algorithm on a variety of data sources along with the selection of different 
features. Evolution of the field and its future research trends can be found at a 
work done by Cambria and colleagues (2013). They reviewed 1) opinion mining 
techniques ranging from heuristics to discourse structure, 2) different levels of 
analysis ranging from document-level to aspect-level, and 3) four approaches 
regarding the identification of implicit information associated with word tokens 
such as keyword spotting, lexical affinity, statistical methods, and concept-based 
approaches. The authors considered the multi-modal sentiment analysis as one of 
the promising approaches, laying stress on the importance of investigating a 
variety of sources such as textual, acoustic, and video features. Feldman (2013) 
discussed opinion mining techniques to solve five specific problems: 1) document- 
level sentiment analysis, 2) sentence-level sentiment analysis, 3) aspect-level 
sentiment analysis, 4) comparative sentiment analysis, and 5) sentiment lexicon 









acquisition. He introduced major applications of sentiment analysis such as 
customer review mining towards products and services, preference mining on 
candidates running for political positions, and market value prediction. Medhat 
and colleagues (2014) surveyed 54 research papers on algorithms and applications 
of sentiment analysis. They categorized these papers into six groups: 1) sentiment 
analysis, 2) sentiment classification, 3) feature selection, 4) emotion detection, 5) 
building resource, and 6) transfer learning. This paper showed that naive Bayes 
and support vector machine are the most frequently used techniques for grouping 
opinions. In addition, WordNet was said to be the most commonly used lexical 
source to solve sentiment analysis problems. 

The review articles discussed above examined a variety of aspects of opinion 
mining research. However, there has been little concerted effort to investigate 
emerging trends and recent developments in opinion mining in such a 
comprehensive, systematic, and computationally driven manner that we use in the 
present study. In addition, one commonality of these papers is that the surveyed 
articles were selected based on prior domain knowledge or without specific 
selection criteria. While aggregated and/or brief discussions on individual papers 
were provided, the implication and importance of individual papers to the field 
were missing. This may have helped readers approach individual papers more 
systematically and selectively for a deeper understanding. The absence of clear 
thematic categories is also one possible limitation of these reviews. Therefore, 
these surveys lack objectivity and clarity in describing the intellectual structure 
and emerging trends in the field. Since opinion mining is a developing, fast¬ 
growing domain, a systematic and comprehensive investigation of intellectual 
landscapes and recent developments is essential. To bridge these gaps, the present 
study aimed to explore the intellectual structure of the field through a bibliometric 
analysis of extensive scientific literature, taking a variety of units of analysis into 
account. We also took a close look at landmark articles of the field that are chosen 
by multi-faceted criteria. 

Methods 

In this section, we introduce the data collection method and analytical approaches 
of the present study. The research procedure is shown in Figure 1. 
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Figure 1: Research procedure 


Data collection 


The goal of the present study was to explore the intellectual structure of opinion 




























mining research. We aimed to identify thematic patterns, research networks, 
emerging trends and new developments, together with landmark articles in the 
domain. Toward that end, we retrieved bibliographic records from the Web of 
Science, using a topic search. There were two problems that we encountered with 
this data collection approach per se. First, it is acknowledged that the topic search 
does not include relevant records if querying terms do not appear in the targeted 
fields such as titles, abstracts, and keywords. In addition, despite the fact that 
conference proceedings are a common document type in computer and 
information-related sciences, the Web of Science under-represents publications 
from conferences I Kim and Chen. 201^ 1. To remedy these issues, we employed a 
two-step complementary approach in data collection. First, we assumed that if an 
article cites at least one of the records retrieved by the topic search, then the 
article is topically relevant to opinion mining I Chen. Hu. Liu, and Tseng. 2012 k As 
Garfield ( Tq7q 1 argues, citation indexing is an alternative strategy to capture a 
much broader context of a study. Therefore, we collected additional records, using 
a citation expansion implemented on the Web of Science. The dataset obtained by 
the topic search was regarded as the core dataset and the expanded one represents 
a broader context of the core. Using the citation expansion supports our goal to 
explore much broader landmark references influencing emerging trends and 
recent developments in opinion mining. Second, we triangulated our data 
collection by conducting patent and citation searches on Derwent Innovations 
Index. This database indexes technological inventions in chemical, electrical, 
electronic, and mechanical engineering. 

The detailed procedure of our data collection is as follows. First, a data collection 
method proposed by preceding scientometric research I Chen. Dubin. and Kim. 
2014a . b; Kim and Chen. 2017 : Kim. Zhu. and Chen. 2016 I was used. For this 
basic search, four query phrases identified as representative for the domain were 
used: “opinion mining” OR “sentiment anal*” OR “subjectivity anal*” OR “review 
mining”. We employed the wildcard * to capture relevant variations of a word, 
including no character, such as sentiment analysing and sentiment analytics. The 
use of double quotation marks helped queries considered to be clauses. Then, a 
record was regarded as relevant if any of the terms is found in the title, abstract or 
keyword fields of the record. The queries resulted in 2,010 records from 2002 
through 2015 as of August 31 2016 when considering articles, proceeding papers, 
and review articles as representative record types. Then, the core records were 
cited by 3,418 articles, proceedings, and review articles on the Web of Science. 
Finally, the query set returned 168 patent records between 2005 and 2015 from 
Derwent Innovations Index. We integrated all of these data sets and used the 
merged one for the present scientometric investigation. Table 1 describes the brief 
statistics of the dataset. Figure 2 depicts the number of records over time. As 
depicted in the figure, opinion mining has received exponentially increasing 
attention from the community. Table 2 describes the query terms used to search 
records and the number of corresponding records to each term. It shows 
“sentiment anal*” contributes the literature most. 


Dataset 

DurationlResultsI Articles 

Proceedings! Reviews 

Authors 

Ref e ren ces I Key wo rd s 

Core 

2002- 

2015 

2,010 

643 

1,322 

45 

6,470 

52,948 

12,815 

Expanded 

2005- 

2016 

3,418 

2,170 

1,222 

126 

11,354 

150,503 

31,393 

Patent 

2005- 

2015 

168 

- 

- 

- 

577 

936 

1,047 


Table 1: Data statistics 
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Figure 2: Records distribution overtime 


Query term 

Core 

dataset 

Expanded 

dataset 

Patent 

dataset 

sentiment anal* 

1,615 

2,972 

149 

opinion mining 

767 

1,591 

17 

subjectivity 

anal* 

34 

213 

1 

review mining 

30 

76 

1 


Table 2: Query terms and retrieved records (in descending order of the number 
of core records) 

Investigation of the intellectual landscapes, emerging 
trends, and new developments in opinion mining 

Scientometrics is the quantitative study of science. We employed scientometric 
approaches to analyse bibliographic records for the present domain analysis. 
There are several advantages in our study in comparison with a conventional 
domain analysis as follows f Chen. Duhin. and Kim. 2014b ). Firstly, a much 
broader and more diverse range of relevant topics can be explored due to the use 
of citation expansion and patent records. Secondly, this kind of domain 
investigation can be conducted as frequently as needed, although an inquirer does 
not have prior expertise in a targeted domain. Finally, a scientometric analysis 
provides an additional point of reference. 

CiteSpace, developed by Chen (2006), is a scientometric toolbox to generate and 
analyse networks of co-cited references using bibliographic records. The input is a 
collection of academic literature relevant to a specific topic. Given bibliographic 
records , this tool computationally detects and depicts thematic patterns and 
emerging trends in science. On top of this, CiteSpace provides a visual 
representation called a dual-map overlay which renders a domain-level view of 
the growth of the literature f Chen and Levdesdorff. 201/0 . 

In a scientometric study, the intellectual landscapes of a scientific domain can be 
represented by a variety of networked entities such as cited references and 
keywords f Chen. Duhin. and Kim. 2014b ). Specifically, we focused on document 
citation networks and networks of co-occurring keywords to explore emerging 
trends and recent developments in opinion mining research. The key features of 
the present investigation include: 1) domain-level citation paths, 2) frequently 
assigned subject categories, 3) keyword co-occurrence, 4) networks of highly cited 
references, and 5) influential articles selected by a variety of metrics. 



















Terminology 


In order to clearly communicate the technical approaches and findings of the 
present study to the audience, we introduce structural measures and text mining 
techniques used in this paper as follows: 

g-index: The g-index, suggested by Egghe (2006), is an author-level metric to 
measure the scientific productivity of an individual. It considers the unique largest 
number of the top g highly cited articles received together more than or equal to g 
square citations. Examining the entire entities in our data, i.e. keywords and cited 
references, may be computationally challenging. It may not intuitively 
communicate the true structure of the domain to the audience as well. In addition, 
we argue that employing the g-index in selecting core entities is sounder than 
using the h-index since the g number of cores is always at least as big as the h 
cores. Thus, we used g-index to select a significant fraction of frequently occurring 
keywords and cited articles within a l-year slice of time . 

Pathfinder networks: Bibliographic networks can be highly dense with many links 
between entities. Network pruning or link reduction, the process in order to 
systematically remove excessive links, can address this issue. Based on the 
proximity of entity pairs, pathfinder networks capture the shortest paths, so links 
are eliminated when they are not on shortest paths f Schvaneveldt. Durso. and 
Dearholt. iqSq L In this study, we employed pathfinder networks to eliminate 
redundant links between entities of analysis. 

Betweenness centrality: Betweenness centrality is an indicator of a node 
considering the number of shortest paths from all vertices to all others that pass 
through the node f Brandes. 2001 ). A node with a high value of betweenness 
centrality has a large influence on the transfer of information through the network 
f Chen. 2011 k If a node provides the only link between two large but previously 
unconnected groups of nodes, it would have a very high degree of betweenness 
centrality. In our study, we regarded this topological property as a significant sign 
of a bibliographic entity’s influence. 

Burstiness: Kleinberg (2002) proposed an algorithm called burst detection which 
captures the burstiness of events with certain features rising sharply in frequency. 
Based on this concept, an entity can be regarded as having bursting activities if it 
shows an intensive frequency of appearance during a specific duration of time. It 
overcomes the limitation of just considering the cumulative number of metrics 
such as citations as a measure of an entity’s impact. 

Sigma: Sigma is a measure identifying scientific publications with topological 
burstiness. It is defined as (betweenness centrality + 1) to the power of burstiness 
such that the temporal brokerage mechanism plays more prominent role than the 
rate of raw citations f Chen. etal. 2QOq 1. We regarded this metric as another 
important sign of a bibliographic entity’s structural burst. 

Automatic cluster labelling: In order to automatically label clusters of cited 
references, we extracted candidate terms from titles and abstracts of citing 
articles. In CiteSpace, these terms are selected by three different algorithms: 1) 
latent semantic indexing (LSI) f Deerwester. etal.. looo l. 2) log-likelihood ratio 
(LLR) f Dunning. loot ! and 3) mutual information (MI). Labels extracted by LSI 
tend to capture implicit semantic relationships across records, whereas those 
chosen by LLR and MI tend to reflect a unique aspect of a cluster f Chen. Ibekwe- 
San.Tuan. and Hou. 2010 L 


Results 













Research trends at a disciplinary level 


A dual-map overlay is an analytical approach that presents the domain-level 
concentration of citations through their reference paths f Chen and Levdesdorff. 
2014 !. A base map depicts the interconnections of over 10,000 published journals 
and these journals are grouped into some regions that represent publications and 
citation activities at a domain level f Chen. Dubin. and Kim. 2Qi4a l. The dual-map 
overlay of opinion mining research is displayed in Figure 3. In the visual 
representation, the left clusters, i.e. research fronts, represent where the retrieved 
records publish, while the right clusters indicate where they cite. Regions are 
labelled by common terms found in the underlying journals. Citation trajectories 
are distinguished by citing regions’ colours. The thickness of these trajectories is 
proportional to the z-score-scaled frequency of citations. Based on this map, we 
can identify patterns of how published articles in opinion mining refer to other 
intellectual bases (cited references). As rendered in the figure, there are four main 
citation paths in our dataset. Table 3 summarizes these paths with citing and cited 
region names. 
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Figure 3: Domain-level citation patterns in opinion mining research 


Citing region 

Cited region 

z- 

score 

psychology, education, health 

psychology, education, 
social 

7.729 

mathematics, systems, 
mathematical 

systems, computing, 
computer 

5.803 

psychology, education, health 

systems, computing, 
computer 

2.727 

mathematics, systems, 
mathematical 

psychology, education, 
social 

2.468 


Table 3: Citation trends at a domain level 

The relationships are sorted by the z-scores in descending order where the values 
are rounded to the nearest thousandth. Each row is identified by the same colour 
of the corresponding path displayed in Figure 3. As described in Table 3, the 
domains the most frequently covering the records are: 1) 6. psychology, 
education, health, and 2) 1. mathematics, systems, mathematical. Then, this 
literature is mostly influenced by 7. psychology, education, social and 1. systems, 
computing, computer. On top of these domains, 5. physics, materials, chemistry 
(rendered in purple lines), 2. molecular, biology, immunology (depicted in yellow 
lines), and 4. medicine, medical, clinical (illustrated in green lines) contribute to 
the domain-level citation trends in opinion mining. It indicates a 
multidisciplinary aspect of opinion mining since publications from multiple 


















domains contribute to the citation landscape of the domain. It also shows that 
social sciences study opinion mining and sentiment analysis within similar 
disciplinary contexts while the mathematical and algorithmic concepts of the 
domain are mostly influenced by the literature that investigates systems and 
computer (See the first and second rows of Table 3). Based on these observations, 
we argue that opinion mining research is partially multidisciplinary and partially 
monodisciplinary. We also identify an interdisciplinary characteristic as described 
at the third and fourth rows of Table 3. It is shown that psychological, educational, 
and biomedical investigations in opinion mining are based on systems 
perspectives of computing and vice versa. 

A subject category of an article can also be regarded as evidence showing an 
upper-level thematic concentration of the article. In order to specify the findings 
above, we examined the Web of Science subject categories assigned to the records. 
Table 4 describes 20 subject categories most frequently given to the dataset. The 
table shows each category’s year of first occurrence, cumulative frequency, and 
betweenness centrality in descending order of the assignment frequency (third 
column). As described in the table, computer science, engineering, linguistics, and 
social sciences such as library and information science, business, economics, 
management, and psychology are among the leading subject categories in opinion 
mining. Among them, computer science, engineering, and their related fields have 
been assigned to the records most frequently. In addition, the category, 
interdisciplinary applications of computer science, shows to have the highest 
betweenness centrality of 0.21. It indicates that interdisciplinary applications of 
computer science have had the largest influence on the emergence, development, 
and diffusion of the ideas in opinion mining. In turn, psychology plays an 
important bridging role between domains that participate in opinion mining 
research (betweenness centrality: 0.20). It is also obvious that the publications 
from electrical engineering, artificial intelligence, linguistics, and library and 
information science have transferred scientific findings to opinion mining. 
Recently, opinion mining has received significant attention from operations 
research and management. It is also interesting to see that business and 
economics have published literature from the very early years of the domain. This 
may indicate that opinion mining started with an aim of understanding and 
scaling customers’ subjective opinions toward products. Based on these 
observations, we assume that the researchers in the domain began by 
investigating algorithmic, linguistic, and psychological aspects of opinion mining 
in order to understand, quantify, and measure the sentimental orientation of 
texts. Successive literature may have explored the improvement and practical 
applications of the techniques to multiple domains. In the next subsections, we try 
to understand these findings deeper at the keyword and reference levels. 


Category 

First 

occurrence 

Frequency Centrality 

cs 

2003 

3,453 


cs, artificial intelligence 

2003 

1,769 


cs, information systems 

2005 

1,671 


cs, theory & methods 

2007 

1,303 


engineering 

2007 

1,029 


engineering, electrical 

2007 

888 


cs, interdisciplinary 
apps. 

2004 

423 

0.21 

info. sci. & lib. sci. 

2005 

348 


business & economics 


332 


ops. res. & mgmt. sci. 

2008 

327 


cs, software engineering 

2006 

326 


linguistics 

2004 

251 


language & linguistics 

2004 

227 

















business 

2006 

188 

0.06 

telecommunications 

2008 

182 

0.06 

cs, hardware & 
architecture 

2007 

182 

0.01 

management 

2006 

177 

0.04 

robotics 

2008 

145 

0.00 

cs, cybernetics 

2006 

108 

0.04 

psychology 

2006 

106 

0.20 


Table 4: Top 20 frequently assigned subject categories In the dataset 

Keywords as indicators of emerging trends and new 
developments 

The investigation of keywords can add richer interpretations to understanding the 
concentration of research themes since keywords represent underlying concepts 
of an article. Table 5 describes 20 keywords the most frequently given by authors 
and indexers to the records. It shows each keyword’s year of first occurrence, 
cumulative frequency, and betweenness centrality. Based on its frequency of 
appearance and centrality, it is evident that sentiment analysis is the keyword 
considered as the most representative of the literature from the early years of the 
domain. This keyword is followed by opinion mining which is the application of 
sentiment analysis to a variety of textual materials such as web and review. 
Following sentiment analysis (betweenness centrality: 0.20), emotion is also 
located on the shortest path connecting pairs of other concepts in opinion mining 
research (betweenness centrality: 0.19). It indicates that the identification and 
extraction of subjective information in source texts have been regarded as the 
most important concepts in opinion mining and had the largest influence on the 
growth of the domain. Technique-wise, text mining, machine learning, and 
natural language processing have been frequently employed to quantify 
sentiment analysis. Classification is among the most frequently investigated 
techniques, also being a goal of understanding the sentiment orientation of text. 
As reflected in the recent keywords such as social media, twitter, social network, 
and word of mouth, the literature now focuses on the practical applications of 
opinion mining to marketing and social networking platforms. 


Keyword 

First 

occurrence 

Frequency Centrality 

sentiment analysis 

2006 

1,458 

0.20 

opinion mining 

2006 

721 

0.09 

social media 

2011 

378 

0.07 

twitter 

2011 

340 

0.06 

classification 

2009 

303 

0.07 

text mining 

2007 

246 

0.05 

model 

2010 

242 

0.03 

social network 

2009 

207 

0.03 

machine learning 

2006 

194 

0.07 

emotion 

2005 

184 

0.19 

word of mouth 

2011 

182 

0.07 

web 

2010 

179 

0.02 

text 

2009 

178 

0.11 

information 

2010 

174 

0.02 

system 

2009 

165 

0.03 

network 

2010 

159 

0.03 

natural language 
processing 

2009 

154 

0.04 

review 

2011 

153 

0.01 

sentiment classification 

2006 

145 

0.09 

internet 

2007 

132 

0.04 









































Table 5: Top 20 frequently occurring keywords in the dataset 


Figure 4 (below) displays the co-occurrence network of these keywords. For the 
clarity of the figure, we only labelled keywords listed in Table 5. The size of a node 
is proportional to its frequency and each node is coloured by community 
membership (' Blondel. Guillaume. Lambiotte. and T.efehvre. 2008 k Finally, the 
more frequently keywords co-occur, the more closely they are located. As depicted 
in the figure, the formulation of the network is consistent with the findings above. 
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Figure 4: Keyword co-occurrence network 

Table 6 (below) describes 20 keywords receiving the most intensive attention 
during a specific span of time. We sorted the keywords by the beginning years of 
the bursts so that we could identify temporal patterns. Opinion mining and 
sentiment analysis are among the most intensively bursting keywords across the 
entire time duration. Sentiment classification has been one of the most important 
goals from the earlier years. Opinion extraction is the task of automatically 
extracting structured opinions from unstructured data such as text. For sentiment 
classification and opinion extraction, machine learning and text mining have 
been employed from the community. In the previous section of examining trends 
in subject category assignment, we assumed that literature in an earlier stage may 
have explored linguistic aspects sentiment orientation. The burstiness of the 
keyword language complements this assumption. A variety of techniques to 
extract sentiment and to apply quantified subjectivity have shown bursting trends 
with language (See information retrieval and information extraction ). Then, the 
automatic collection of a corpus from consumer contributed content such as blog 
has had large attention and natural language processing has been intensively 
employed to this end. Recently, there has been growing interest in applying data 
mining and sentic computing to a variety of textual genres such as web and 
microblogging. Thus, investigating the burstiness of keywords adds richer 



interpretations to the understanding of the emerging trends in opinion mining 
than just considering the cumulative number of keyword occurrence. 


Keyword 

Burst 

Begin 

End 

2002-2015 

opinion mining 

52.860 

2006 

2012 


sentiment analysis 

8.549 

2006 

2011 


sentiment 

classification 

7.197 

2006 

2011 


machine learning 

4.859 

2006 

2009 


text mining 

8.486 

2007 

2011 


opinion extraction 

3.621 

2007 

2011 


language 

9.148 

2008 

2012 


information retrieval 

4.798 

2008 

2011 


information 

extraction 

3.619 

2008 

2010 


corpus 

2.729 

2009 

2012 


blog 

5.115 

2009 

2013 


natural language 
processing 

3.434 

2009 

2010 


opinion analysis 

3.684 

2009 

2011 


data mining 

3.700 

2010 

2012 


document 

3.796 

2010 

2013 


sentiment analysis 

7.981 

2010 

2012 


web 

7.511 

2010 

2012 


text analysis 

3.220 

2011 

2013 


microblogging 

4.547 

2011 

2012 


sentic computing 

3.922 

2011 

2012 



Table 6: Top 20 bursting keywords in the dataset (bursts rounded to the 
nearest thousandth) 

Document co-citation network 

Figure 5 (below) visualises the document co-citation network in the dataset. The 
network consists of g highly cited references within a given slice of time. Each 
node represents a cited article and the size of a node is proportional to its cited 
frequency. References with citation bursts are rendered with rings in red whereas 
nodes with purple rings have high betweenness centrality values. Landmark 
articles cited more than or equal to 133 times are labelled in black. Nodes are 
grouped in same lines by a clustering technique called smart local moving 
f Waltman and van Eck. 201t h Clusters are numbered in such a way that higher 
rankings are given to the clusters containing more references. The colour legend 
at the top indicates that links and citations in cooler colours happen closely to the 
year of 2002 whereas hotter ones occur in close years to 2015. Referring to the 
legend, we can keep track of the thematic trends and temporal developments in 
opinion mining research. 
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#0 sentiment analysis | key issues 
01 pessimists | opinion mining 
02 sentiment analysis | explicit sentiment analysis 
#3 critical framing | new york times readers opinions 
04 comparison | german 

05 pay-per-click campaign management | supervised sentiment analysis 
06 review readership | tourism online reviews 
07 domain adaptation | analyzing business ads 
08 dependency parsing 
09 general public 

010 e-commerce reviews management system 
011 Chinese sentiment analysis | conditional random fields 
012 online customer reviews | compound sentences 
013 learning subjective language 


Figure 5: Timeline visualization of the network (labelled by LSI) 


Considering both the size and recency of member nodes, we consider clusters #o 
through #4 as representing emerging research themes in the domain. Table 7 
summarizes these clusters in terms of number of member articles, three types of 
labels extracted from titles and abstracts, and average published year of citees. 
Among these clusters, cluster #0 is the largest and oldest one in terms of the 
cluster size and the mean year of cited references. The references in this cluster 
influenced successive research applying sentiment analysis to the investigations of 
social media and user requirements. It is interesting that even though social 
media is one of the recently appearing keywords (See Table 5), relatively old 
papers influenced research under this theme. Cluster #1 is the second largest and 
second oldest one. Literature citing references in this group focuses on mining 
opinion words and analysing customer preferences. Cluster #2, one of the most 
recent ones, consists of references which influenced successive research on 
developing a fuzzy system for the extraction of affective information. Cluster #3 
includes references affecting later literature on understanding newspaper readers’ 
opinions and sharing behaviour on social media. Cluster #4 studies the 
application of sentiment analysis to comparing specific products. It coheres with 
the findings from examining subject categories as reflected in management and 
business (See Table 4) and keyword assignment (See word of mouth in Table 5). 



Cluster labelling technique 


Cluster 

Size 

LSI 

Log- 

likelihood 

ratio 

Mutual 

information 

Mean 

year 

0 

44 

sentiment 
analysis | key 
issues 

social 

media 

user 

requirement 

2004 

1 

37 

Pessimists | 
opinion mining 

opinion 

word 

customer 

preferences 

analysis 

2008 

2 

35 

sentiment 
analysis | 
explicit 
sentiment 
analysis 

affective 

information 

fuzzy system 

2011 

3 

35 

critical framing 
| new york 
times readers 
opinions 

social 

media 

sharing 

behavior 

2011 

4 

26 

comparison | 
german 

electronic 

cigarette 

comprehensive 

empirical 

comparison 

2010 


























Table 7: Emerging clusters in the dataset 


Landmark literature 

Emerging trends and recent developments in a domain can be investigated in 
detail by surveying influential manuscripts of the discipline. For measuring the 
importance of an article, we considered a variety of indicators such as cumulative 
citation counts, intensity of citations during a specific span of time, topological 
importance, and burstiness of structural importance. Tables 8, 9,10, and 11 show 
landmark articles identified by these criteria. Landmark articles are extracted 
from cited references. In each table, manuscripts are coloured as follows: 1) green: 
articles that appear three times across all the tables, 2) yellow: articles that appear 
in two tables, and 3) white: articles that only appear in one table. In reviewing 
them, we chose only the articles that appear at least twice across the tables. In 
addition, some articles are not considered for this discussion if they are review 
papers, books, and book chapters. Topically irrelevant articles are also omitted. 
We argue that they do not influence thematic patterns and emerging trends in a 
domain. Eight papers selected in this way are summarized in Table 12. In the 
following paragraphs, we discuss each of eight papers in chronological order. 


Citation 

count 

Reference 

Cluster 

# 

888 

Pang and Lee (2008) 

5 

257 

Taboada, eta/. (2011) 

5 

256 

Liu (20101 

5 

190 

Bollen, Mao, and Zeng (2011) 

2 

174 

Thelwall, eta/. (2010) 

3 

166 

Ding, Liu, and Yu (20081 

1 

147 

Abbasi, Chen, and Salem (20081 

9 

144 

Wilson, Wiebe, and Hoffmann (2009) 

0 

142 

Liu (20121 

7 

133 

Cambria, Schuller, Xia, and Havasi 
(20131 

1 


Table 8: Top 10 highly cited articles in the dataset 


Burst 

Reference 

Cluster # 

51.24 

Turney (2002) 

11 

47.15 

Pang, Lee, and Vaithyanathan (2002) 

0 

46.48 

Turney and Liftman (2003) 

0 

35.94 

Wiebe, et at. (2004) 

0 

34.39 

Pang and Lee (2004) 

0 

28.45 

Dave, Lawrence, and Pennock (2003) 

10 

22.54 

Kim and Hovy (2004) 

0 

21.49 

Wilson (2005) 

0 

20.18 

Liu, Hu, and Cheng (2005) 

1 

13.27 

Pang and Lee (2005) 

10 


Table 9: Top 10 articles with the highest citation bursts 


Centrality 

Reference 

Cluster # 

0.14 

Jindal and Liu (2008) 

10 

0.12 

Pang and Lee (2005) 

10 

0.12 

Chevalier and Mayzlin (2006) 

6 

0.12 

Tang, Tan, and Cheng (2009) 

5 

0.11 

Liu, Hu, and Cheng (2005) 

1 

0.10 

Kanayama and Nasukawa (20061 

1 

0.10 

Thelwall, et at. (2010) 

3 

0.10 

Oiu. Liu. Bu. and Chen 120111 

1 









































































0.09 

Wilson (2005) 

0 

0.09 

Riloff, Wiebe, and Wilson (20031 
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Table 10: Top 10 articles with the highest betweenness centrality values 


Sigma 

Reference 

Cluster # 

26.48 

Pang, Lee, and Vaithyanathan (2002) 

0 

18.09 

Turney (20021 

11 

14.54 

Turney and Littman (2003) 

0 

7.62 

Liu, Hu, and Cheng (2005) 

1 

6.12 

Wilson (2005 ) 

0 

4.41 

Chevalier and Mayzlin (2006) 

6 

4.32 

Pang and Lee (2005) 

10 

4.21 

Kim and Hovy (2004) 

0 

3.90 

Riloff, Wiebe, and Wilson (20031 

0 

2.63 

Yu and Hatzivassiloglou (2003) 

0 


Table 11: Top 10 articles with the highest sigma 


Reference 

Cluster 

# 

Citation Burst Centrality Sigma 

Liu, Hu, and Cheng 
(20051 

1 


X 

X 

X 

Pang and Lee (2005) 

10 


X 

X 

X 

Turney (2002) 

11 


X 


X 

Pang, Lee, and 
Vaithyanathan 
(20021 

0 


X 


X 

Turney and Littman 
(20031 

0 


X 


X 

Riloff, Wiebe, and 
Wilson (2003) 

0 



X 

X 

Kim and Hovy (2004) 

0 


X 


X 

Thelwall, et at. 

(20101 

3 

X 


X 



Table 12: Select landmark articles in the dataset 

Turney (2002) proposed an unsupervised learning algorithm using mutual 
information to classify reviews into positive and negative groups. The method 
calculates the semantic orientation of a phrase, considering mutual information 
between the phrase and the sentiment groups, excellent and poor. The semantic 
orientation of a review is then classified compared to the average semantic 
orientation of included phrases. The proposed technique achieved an average 
accuracy of 74%, given a dataset consisting of documents on automobiles, banks, 
movies, and travels. Even though the method is simple and straightforward, the 
performance of the method needs to be further validated, due to the limited size of 
the dataset. 

Pang, Lee, and Vaithyanathan (2002) tested the performance of three machine 
learning algorithms: Naive Bayes, maximum entropy classification, and support 
vector machine on the sentiment classification of movie reviews. In testing these 
algorithms, they employed different feature sets such as unigrams, bigrams, and 
parts of speech. The support vector machine based on unigrams achieved the best 
performance, which reached 82.9%. The contribution of the work lies in its broad 
investigation of different features in classifying peoples’ opinions. The authors 
laid stress on the importance of identifying plausible features that screen whether 
sentences are on-topic. The application of these approaches to other domains 
might be helpful for strengthening the generalizability of the study. 





























































Turney and Littman (2003) introduced a method that infers the semantic 
orientation of words based on their statistical associations with other positive and 
negative words. Two co-occurrence based measures, pointwise mutual 
information and latent semantic analysis, were employed in the study. The 
evaluation showed an accuracy of 82% and it could be raised up to 95% when 
applied to mild words. The proposed approach can be applied to a variety of parts 
of speech such as adjectives, adverbs, nouns, and verbs. The limitation of the 
method is that it requires a large size of corpora to calculate significant statistical 
associations among words to achieve better performance. 

Riloff, Wiebe, and Wilson (2003) proposed a system that classifies subjective and 
object sentences. The system uses extraction patterns to learn subjective nouns 
using bootstrapping algorithms. Then, a Naive Bayes classifier was trained using 
the identified subjective nouns and other features such as discourse features and 
subjective clues. Results showed that the system achieved 77% recall and 81% 
precision. A contribution of the work is the use of bootstrapping methods for the 
identification of extraction patterns. 

Kim and Hovy (2004) proposed a system that determines the sentiment of 
opinions by examining people who hold opinions toward a given a topic. The 
system has modules for determining sentiments of words as well as sentences that 
contain those words. To achieve this, they started with selecting less than too seed 
words (both positive and negative) manually and expanded the list by extracting 
additional words from WordNet, and finally obtained 20,000 words with the 
polarity information. They proposed three models with different ways of combing 
sentiments of individual words: a model that only considers polarities, a model 
that uses the harmonic mean of sentiment strengths, and a model that uses the 
geometric mean of sentiment strengths. Evaluation showed that the highest score 
the system achieved was 81% accuracy. 

Liu, Hu, and Cheng (2005) proposed a system that analyses online customers’ 
reviews toward products. The system supports a feature-by-feature comparison of 
customer opinions. A language pattern mining-based approach was proposed for 
the extraction of product features. This system provides a visual comparison of 
consumer opinions on competing products. A contribution of this work is the 
proposal of a supervised pattern discovery method that supports the automated 
identification of products features from Pros and Cons. 

Pang and Lee (2005) proposed a meta-algorithm for rating-inference problem 
that take multi-point scale of rating into account. Instead of classifying a review as 
either positive or negative, they viewed the problem as a multi-category 
classification. They compared the performance of three approaches: multiclass 
SVM (OVA), regression, and metric labelling on the task. While there was no 
single approach that performed best in every cases, each approached showed its 
own strength in different tasks. 

Thelwall and colleagues (2010) proposed a novel algorithm to extract sentiment 
strength from informal English text. They employed a machine learning approach 
to optimize sentiment term weightings. The authors extracted sentiment 
information from texts of nonstandard spelling. An evaluation was performed on 
MySpace comments and results showed that the proposed method achieved 60% 
and 72% accuracy in predicting positive and negative emotion respectively. 

The literature discussed here has examined how to define and quantify the 
semantic orientation of texts, using lexicon- and/or machine learning-based 
approaches. Lexicon-based approaches examine words or phrases in a document 
and use metrics such as mutual information to judge their semantic orientation. 
Machine learning-based techniques employ widely accepted algorithms such as 








Naive Bayes and support vector machines. These techniques train classifiers with 
labelled sentiment. On top of this, classification was among the most frequently 
employed techniques. In general, lexicon-based techniques show lower 
performance but higher applicability, i.e. can be applied to many domains, while 
machine learning-based techniques outperform, but are only applicable to limited 
domains. These approaches have been applied to a variety of units of analysis, 
ranging from individual words to complete reviews that are comprised of multiple 
sentences. 

To sum up, most of the landmark papers focused on the computation and 
grouping of sentiment orientation. Even though often context-specific, they 
provide practical implications towards the understanding and applications of 
sentiment analysis. These findings are also consistent with the findings in 
previous sections. 

Discussion 

Research question #1: what are epistemological 
characteristics of opinion mining research? 

The investigation at the domain level reveals the following epistemological 
characteristics of opinion mining research. First, most of the studies came from 
two domain groups, psychology, education, health and mathematics, systems, 
mathematical, and each of these disciplinary groups cites research within similar 
fields, psychology, education, social and systems, computing, computer, 
respectively. The literature from other domains such as physics, materials, 
chemistry, molecular, biology, immunology, and medicine, medical, clinical also 
contribute to the domain-level citation trends. Second, we also identified the 
interdisciplinary nature of opinion mining: the two groups of domains frequently 
cross-reference to each other. Next, the advent and growth of opinion mining was 
mainly led by a few subject categories. Interdisciplinary applications of computer 
science and psychology were the driving forces that have had the largest influence 
on the emergence, development, and diffusion of the ideas in opinion mining. 
Moreover, opinion mining is relatively new and still developing. There are a few 
emerging research themes identified from keyword co-occurrence and document 
co-citation network. Finally, opinion mining has received large attention from 
multidisciplinary domains recently. 

Research question #2: which thematic patterns of 
research do occur in the domain? 

First, the subject category assignment shows that computer science, engineering, 
linguistics, and social sciences such as library and information science, business, 
economics, management, and psychology are among the leading categories in 
opinion mining research. In particular, computer science, psychology, electrical 
engineering, artificial intelligence, linguistics, and library and information have 
transferred important knowledge to the domain. Second, the exploration of 
keywords adds a richer interpretation: the identification and extraction of 
subjective information in source texts and the applications of sentiment analysis 
to a variety of textual materials have been regarded as one of the most important 
themes of the domain. The detection of emotion from source texts was the concept 
located on the shortest path connecting pairs of other concepts in opinion mining 
research. Machine learning, text mining, data mining, and natural language 
processing have been employed to this end. Classification is among the most 
frequently investigated techniques, also being a goal of understanding the 
sentiment orientation of text. The automatic collection of a corpus from a variety 
of textual genres such as web, blog, and microblogging has received large 


attention for opinion mining purposes. Finally, the examination of the landmark 
articles reveals that investigating algorithmic and linguistic aspects of sentiment 
analysis has been of the community’s greatest interest: they have put stress on 
understanding, quantifying, and applying the sentimental orientation of texts. 

Research question #3: what are emerging trends and 
recent developments of the field? 

Recently, opinion mining has received large attention from many 
multidisciplinary domains and recent literature has explored the practical 
applications of opinion mining to marketing and social networking platforms. 
Bursting keywords such as word of mouth, social media, social network, and 
twitter evidence this trend. The analysis of document co-citation network 
identifies that emerging clusters of research include l) understanding consumer 
attitudes for effective online marketing and prediction of a product’s market 
value, 2) developing a system for extracting affective information from text, 3) 
investigating newspaper readers’ opinions and information sharing behaviour on 
social media. In recent years, landmark manuscripts also have explored the 
improvement and applications of opinion mining to a variety of domains such as 
informal English text. 

Conclusion and future work 

In the present study, we aimed to explore the intellectual structure of opinion 
mining research in a systematic and comprehensive way. Toward that end, we 
investigated domain-level citation trends, assignment of subject categories, 
keyword co-occurrence, document co-citation network, and landmark articles. 
Based on the findings from the study, we articulate that the field of opinion 
mining is still emerging. For example, a few domains have participated in 
publication while they mainly cite similar kinds of research. They less frequently 
cross-cite each other. Next, the advent, development, and diffusion of opinion 
mining have been largely led by a few domains such as computer science, 
engineering, linguistics, psychology, and library and information science. These 
domains have explored algorithmic and linguistic aspects of sentiment analysis to 
understand, quantify, and apply the sentiment orientation of texts. Recently, 
multidisciplinary domains have participated in studying opinion mining and this 
body of literature has explored the practical applications of opinion mining such 
as to marketing and analysing social networking platforms. 

The approaches of the present study provide advantages in investigating 
intellectual structure of a science as follows. First, we systematically triangulated 
data collection. Conventional studies of domain analysis often cover only a 
fraction of published literature. The combination of topic search, citation 
indexing, and patent search in our method provides a systematic way to broaden 
the coverage of a knowledge domain, thus provides a much broader and more 
integrated context. Second, we investigated the domain from a multi-faceted point 
of view. Bursting keywords, document co-citation networks, emerging thematic 
patterns, and citation trends were identified in this study. In particular, we 
employed the concept of burstiness for measuring the importance of units of 
analysis. An entity's frequency of occurrence is a significant indicator reflecting its 
impact. However, we cannot identify its influence or density of that impact during 
a particular span of time if only considering the cumulative measure. Instead, we 
argue that using the concept of citation and keyword bursts is much more 
convenient in understanding the advent and decline of scientific trends. Finally, 
the analytical procedure and tool used in the present study enabled us to explore 
time-aware research trends in the domain. The scientometric approach and tool 
employed in this study can support comprehensive data collection and scalable 


analytics. In addition, one can conduct domain analysis of his or her concern as 
frequently as needed without prior knowledge. Thus, the proposed approaches 
have advantages such as having a relatively higher reproducibility and lower cost 
for conducting studies at a larger scale, especially as the number of publications in 
sciences is fast-growing. 

There are still several challenges in our study despite the fact that we have a 
variety of methodological advantages described above. First, the topic search 
along with citation indexing still may not be able to capture some relevant 
records. Generally, the vocabulary mismatch is said to present a challenge for 
keyword-based search f Peerwester. Dnmais. Furnas. T.andauer. and Harshman. 
iqqo I. Second, the Web of Science as our source of data may have 
underrepresented conference proceedings, which is also reported to be an issue 
for disciplines such as social sciences and arts and humanities f Mongeon and 
Paul-Hns. 2016 T In addition, at the time of data collection, we could only collect 
records from the core collection of the Web of Science due to the institutional 
subscription. It might be possible that the number of records collected in each 
organization from the core collection of Web of Science varies due to different 
subscription policies. Some relevant records might have been omitted from our 
data sets accordingly. Additional sources such as Scopus are recommended for 
future refinements of this type of analysis. Unfortunately, at the time of the study, 
we did not have access to Scopus. Third, we selected g highly cited references for 
generating the intellectual landscapes. In spite of this metric’s authority, we might 
be too strict in extracting an important proportion of records. It may be worth 
conducting a separate study of the theoretical implications of using a variety of 
conceivable selection criteria. We also plan to apply this scientometirc approach 
to much more comprehensive records that cover a various type of publication 
materials. 
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